The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
######################### Warning from 'xts' package ##########################
# #
# The dplyr lag() function breaks how base R's lag() function is supposed to #
# work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or #
# source() into this session won't work correctly. #
# #
# Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
# conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop #
# dplyr from breaking base R's lag() function. #
# #
# Code in packages is not affected. It's protected by R's namespace mechanism #
# Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning. #
# #
###############################################################################
Attaching package: 'xts'
The following objects are masked from 'package:dplyr':
first, last
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ lubridate 1.9.3 ✔ readr 2.1.5
✔ purrr 1.0.2 ✔ tibble 3.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ xts::first() masks dplyr::first()
✖ dplyr::lag() masks stats::lag()
✖ xts::last() masks dplyr::last()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Annual production by source; linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
ggplot(energy_data_annual, aes(x = Year)) +
  geom_line(aes(y = `Total Fossil Fuels Production (Quadrillion Btu)`, color = 'Fossil Fuels Production'), linewidth = 1) +
  geom_line(aes(y = `Nuclear Electric Power Production (Quadrillion Btu)`, color = 'Nuclear Power Production'), linewidth = 1) +
  geom_line(aes(y = `Total Renewable Energy Production (Quadrillion Btu)`, color = 'Renewable Energy Production'), linewidth = 1) +
  labs(
    title = 'Primary Energy Production',
    x = 'Year',
    y = 'Production (Quadrillion Btu)',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  ) +
  scale_color_manual(values = c(
    'Fossil Fuels Production' = 'red',
    'Nuclear Power Production' = 'blue',
    'Renewable Energy Production' = 'green'
  ))
The graph illustrates the trend in primary energy production from 1950 to 2020. Production rises steadily from approximately 28 quadrillion Btu in 1950 to 59 quadrillion Btu by 1970, then plateaus from 1970 to 2010. After 2010 there is a noticeable spike in production, which may be attributed in part to the high energy demand of high-performance computing in large data centers.
Fossil Fuels Production: This line rises substantially over the period and shows that fossil fuels have been the dominant contributor to primary energy production.
Nuclear and Renewable Energy Production: These lines remain relatively flat compared with fossil fuels, indicating that these sources have contributed less to overall primary energy production. They increase slightly over time, but far less than fossil fuels.
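The growth described above can be quantified with a little arithmetic. The sketch below uses only the approximate chart readings quoted in the text (28 and 59 quadrillion Btu); the variable names are illustrative and not part of the original analysis.

```r
# Approximate values read from the production chart (quadrillion Btu)
start_year  <- 1950
end_year    <- 1970
start_value <- 28
end_value   <- 59

# Total percentage change over the period
pct_change <- (end_value - start_value) / start_value * 100

# Compound annual growth rate (CAGR), as a fraction per year
n_years <- end_year - start_year
cagr <- (end_value / start_value)^(1 / n_years) - 1

cat(sprintf("Total change: %.1f%%\n", pct_change))
cat(sprintf("CAGR: %.2f%% per year\n", cagr * 100))
```

Because the inputs are rounded chart readings, the results should be treated as rough estimates rather than exact figures.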
3.2.2 Primary Energy Consumption
Code
# Annual consumption by source; linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
ggplot(energy_data_annual, aes(x = Year)) +
  geom_line(aes(y = `Total Fossil Fuels Consumption (Quadrillion Btu)`, color = 'Fossil Fuels Consumption'), linewidth = 1) +
  geom_line(aes(y = `Nuclear Electric Power Consumption (Quadrillion Btu)`, color = 'Nuclear Power Consumption'), linewidth = 1) +
  geom_line(aes(y = `Total Renewable Energy Consumption (Quadrillion Btu)`, color = 'Renewable Energy Consumption'), linewidth = 1) +
  labs(
    title = 'Primary Energy Consumption',
    x = 'Year',
    y = 'Consumption (Quadrillion Btu)',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  scale_color_manual(values = c(
    'Fossil Fuels Consumption' = 'red',
    'Nuclear Power Consumption' = 'blue',
    'Renewable Energy Consumption' = 'green'
  )) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  )
Fossil Fuels Consumption: Fossil fuels are the dominant source of energy consumption throughout the period. Consumption rises steadily from 1950 to around 2005, with some fluctuations, and then plateaus with minor ups and downs.
Nuclear Power Consumption: Nuclear consumption becomes noticeable around the late 1960s and early 1970s. It grows gradually until about 2000, after which it stabilizes.
Renewable Energy Consumption: Renewable consumption begins to rise noticeably in the late 1990s. It increases steadily, especially after 2000, and appears to be catching up with nuclear power by the end of the period.
Fossil fuels account for the bulk of energy consumption throughout the period, while nuclear and renewable sources contribute comparatively little. Substantial investment in these sources is needed for them to catch up and to reduce dependence on fossil fuels.
3.2.3 Primary Energy Imports and Exports
Code
# Net imports per year as bars; negative bars indicate net exports
ggplot(energy_data_annual, aes(x = Year, y = `Primary Energy Net Imports (Quadrillion Btu)`)) +
  geom_bar(stat = 'identity', fill = 'orange', color = 'black') +
  labs(
    title = 'Primary Energy Net Imports',
    x = 'Year',
    y = 'Energy (Quadrillion Btu)',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue')
  )
The chart shows the net imports of primary energy into the United States over time.
Observations:
1950s to Early 1970s: Net energy imports were relatively low and stable, showing minimal dependence on imported energy.
Mid-1970s to Early 1980s: There was a noticeable increase in net energy imports, likely due to rising energy demands and geopolitical events affecting oil supply. Since the 1970s, the global oil trade has been predominantly conducted in U.S. dollars (USD), creating a symbiosis between America’s currency and the world’s most traded commodity. The petrodollar emerged as an economic concept in the 1970s as growing U.S. imports of increasingly costly crude oil increased the dollar holdings of foreign producers.
1980s to Early 2000s: A significant rise in net imports occurred, peaking around the mid-2000s. This reflects increased energy consumption and reliance on foreign energy sources. The U.S. experienced growing energy demands driven by economic expansion and technological advancements. This led to higher consumption of oil and natural gas. The U.S. became increasingly reliant on foreign oil, with imports rising significantly.
Mid-2000s to Present: There is a sharp decline in net imports, eventually turning negative. This indicates that the U.S. became a net exporter of primary energy. Factors contributing to this include increased domestic energy production (especially from shale gas and oil), improved energy efficiency, and shifts towards renewable energy sources. (Source: U.S. Energy Independence)
Overall, the chart illustrates a transition from high dependency on imported energy to a position where the U.S. exports more energy than it imports.
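The sign convention behind the chart can be made explicit: net imports are imports minus exports, so a negative value means the country is a net exporter. The following is a minimal sketch with hypothetical numbers chosen only for illustration; they are not taken from the EIA dataset.

```r
# Hypothetical annual figures in quadrillion Btu, for illustration only
trade <- data.frame(
  year    = c(1975, 2005, 2023),
  imports = c(14, 35, 22),
  exports = c(2, 5, 30)
)

# Net imports: positive means net importer, negative means net exporter
trade$net_imports <- trade$imports - trade$exports
trade$position <- ifelse(trade$net_imports > 0, "net importer", "net exporter")

print(trade)
```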
3.2.4 Energy Imports vs Energy Consumption
Code
# Monthly imports vs. total consumption; linewidth replaces the deprecated size argument for lines
ggplot(energy_data_monthly, aes(x = Month)) +
  geom_line(aes(y = `Primary Energy Imports (Quadrillion Btu)`, color = 'Primary Energy Imports'), linewidth = 0.5) +
  geom_line(aes(y = `Total Primary Energy Consumption (Quadrillion Btu)`, color = 'Total Primary Energy Consumption'), linewidth = 0.5) +
  labs(
    title = 'Energy Consumption and Energy Imports',
    x = 'Timeline',
    y = 'Energy (Quadrillion Btu)',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  )
Code
# Pearson correlation between annual imports and consumption, shown in the subtitle
import_consumption_cor <- cor(
  energy_data_annual$`Primary Energy Imports (Quadrillion Btu)`,
  energy_data_annual$`Total Primary Energy Consumption (Quadrillion Btu)`,
  method = "pearson"
)

# Scatter plot with a fitted linear regression line and its confidence band
ggplot(energy_data_annual, aes(x = `Primary Energy Imports (Quadrillion Btu)`, y = `Total Primary Energy Consumption (Quadrillion Btu)`)) +
  geom_point(color = "blue", size = 1) +
  geom_smooth(method = "lm", color = "red", se = TRUE) +
  labs(
    title = "Energy Dependency Analysis",
    x = "Primary Energy Imports (Quadrillion Btu)",
    y = "Total Primary Energy Consumption (Quadrillion Btu)",
    subtitle = paste("Pearson Correlation Coefficient:", round(import_consumption_cor, 2)),
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    plot.subtitle = element_text(hjust = 0.5, color = 'purple')
  )
`geom_smooth()` using formula = 'y ~ x'
Together, the two graphs show how energy consumption and imports have changed over time and reveal important trends and relationships between them.
Consumption Trends
Total Primary Energy Consumption follows a steady upward trajectory, from around 5 quadrillion Btu per month in January 1973 to approximately 7.5 quadrillion Btu per month in August 2024. Regular seasonal peaks and troughs appear throughout the timeline. Overall, consumption grows consistently despite short-term variations.
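One way to separate the long-run growth from the seasonal peaks and troughs is a classical decomposition. The sketch below uses base R's `decompose()` on a synthetic monthly series (an assumption made so the example runs on its own); in the report, the same call could be applied to a `ts` built from the monthly consumption column.

```r
set.seed(1)

# Synthetic monthly series: linear trend + yearly cycle + noise (illustrative only)
n_months    <- 240
month_index <- seq_len(n_months)
trend       <- 5 + 0.01 * month_index
seasonal    <- 0.5 * sin(2 * pi * month_index / 12)
noise       <- rnorm(n_months, sd = 0.1)

consumption <- ts(trend + seasonal + noise, start = c(1973, 1), frequency = 12)

# Classical additive decomposition into trend, seasonal, and random parts
parts <- decompose(consumption)

# One seasonal effect per calendar month (centered to sum to zero)
print(round(parts$figure, 2))

# Range of the estimated trend (NA at the edges of the moving average)
print(range(parts$trend, na.rm = TRUE))
```

`stl()` is a more flexible alternative when the seasonal pattern changes over time.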
Import Patterns
Primary Energy Imports started at roughly 1.5 Quadrillion Btu each month in the 1970s. Imports peaked around 2005-2010 at approximately 3 Quadrillion Btu each month. A notable decline in imports occurred after 2010, stabilizing at about 2 Quadrillion Btu each month by 2020.
Statistical Relationship
The scatter plot reveals a strong positive correlation between imports and consumption. The Pearson correlation coefficient of 0.95 indicates a very strong linear relationship between the two. The narrow confidence band (gray shading) around the regression line suggests that the fitted line is estimated precisely. The regression line has a clear positive slope, indicating that higher imports generally correspond to higher consumption. Data points cluster tightly around the regression line, particularly in the middle range, and the relationship remains consistent across different levels of imports and consumption.
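The correlation and regression slope discussed here can also be checked numerically rather than read off the plot. This is a minimal sketch on synthetic data (the vectors `imports` and `consumption` are invented for illustration); with the real dataset, the two annual columns would take their place.

```r
set.seed(42)

# Synthetic stand-ins for the annual imports and consumption columns
imports     <- seq(10, 30, length.out = 50)
consumption <- 60 + 1.5 * imports + rnorm(50, sd = 3)

# Pearson correlation with a significance test and confidence interval
result <- cor.test(imports, consumption, method = "pearson")
r  <- unname(result$estimate)
ci <- result$conf.int

# Slope of the least-squares line that geom_smooth(method = "lm") draws
fit   <- lm(consumption ~ imports)
slope <- unname(coef(fit)["imports"])

cat(sprintf("r = %.2f, 95%% CI [%.2f, %.2f], slope = %.2f\n", r, ci[1], ci[2], slope))
```

A high coefficient between two series that both trend over time does not by itself establish that one drives the other.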
Key Insights
- Despite growing total energy consumption, there’s a decreasing reliance on imports in recent years.
- The gap between consumption and imports has widened over time, suggesting increased domestic energy production or diversification of energy sources.
- The seasonal variations in consumption are more pronounced than fluctuations in imports, indicating stable import patterns despite varying demand.
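The first two insights can be expressed as simple ratios. The sketch below uses the approximate monthly readings quoted earlier in this section (1.5 and 2 quadrillion Btu of imports, 5 and 7.5 of consumption); pairing them by period is a simplifying assumption, so the output is only indicative.

```r
# Approximate monthly readings from the text (quadrillion Btu)
readings <- data.frame(
  period      = c("1970s", "2020s"),
  imports     = c(1.5, 2.0),
  consumption = c(5.0, 7.5)
)

# Share of consumption covered by imports
readings$import_share <- readings$imports / readings$consumption

# Absolute gap between consumption and imports
readings$gap <- readings$consumption - readings$imports

print(readings)
```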
3.2.5 Trends in Types of Fossil Fuel Production
Code
# Reshape the four fossil fuel production columns into long format
fossil_fuel <- energy_data_annual |>
  select(
    `Year`,
    `Coal Production (Quadrillion Btu)`,
    `Natural Gas (Dry) Production (Quadrillion Btu)`,
    `Crude Oil Production (Quadrillion Btu)`,
    `Natural Gas Plant Liquids Production (Quadrillion Btu)`
  ) |>
  pivot_longer(
    cols = -Year,
    names_to = 'Energy Type',
    values_to = 'Production (Quadrillion Btu)'
  ) |>
  # Drop the unit suffix so legend labels stay short
  mutate(`Energy Type` = str_replace_all(`Energy Type`, " \\(Quadrillion Btu\\)", ''))

# Stacked bar chart of production by fuel type
ggplot(fossil_fuel, aes(x = `Year`, y = `Production (Quadrillion Btu)`, fill = `Energy Type`)) +
  geom_bar(stat = 'identity') +
  labs(
    title = 'Fossil Fuel Production',
    y = 'Production (Quadrillion Btu)',
    x = 'Year',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  )
This chart shows how much fossil fuel was produced from 1949 to 2023, measured in quadrillion Btu (British thermal units). The chart uses a stacked bar format: each colored segment is stacked on top of the others, making it easy to see both individual contributions and the total combined production in any year.
Let us break down what we can see:
Coal (Red) - The red bars at the top show coal production. It stayed fairly steady from 1949 to 1980, increased after 1980, and then began to decline in the late 2010s, a trend that continues through 2023.
Crude Oil (Green) - The green section represents crude oil production. It grew steadily from 1950 to 1980, remained relatively flat for many years, and then showed another strong increase from around 2010 to 2020.
Natural Gas (Blue) - The blue section shows dry natural gas production. It has grown consistently over the years, with a particularly sharp increase from 2005 to 2020, and is now the largest component of U.S. fossil fuel production.
Natural Gas Plant Liquids (Purple) - The purple section at the bottom represents natural gas plant liquids production. This has remained the smallest portion throughout the entire period, showing only modest growth.
Total fossil fuel production has grown dramatically, from about 28 quadrillion Btu in 1949 to nearly 85 quadrillion Btu by 2023. The most dramatic growth happened between 2010 and 2020, driven mainly by crude oil and dry natural gas.
Code
# Keep only the most recent year for the pie chart
fossil_fuel_2023 <- fossil_fuel |>
  filter(Year == 2023)

# Pie chart: a single stacked bar drawn in polar coordinates
ggplot(fossil_fuel_2023, aes(x = '', y = `Production (Quadrillion Btu)`, fill = `Energy Type`)) +
  geom_bar(stat = 'identity', width = 1) +
  coord_polar('y', start = 0) +
  labs(
    title = '% Fossil Fuel Production in 2023',
    y = '',
    x = '',
    fill = 'Energy Type',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.title = element_blank(),
    axis.text = element_blank(),
    panel.grid = element_blank()
  ) +
  # Label each slice with its percentage of the 2023 total
  geom_text(
    aes(label = sprintf("%.1f%%", `Production (Quadrillion Btu)` / sum(`Production (Quadrillion Btu)`) * 100)),
    position = position_stack(vjust = 0.5)
  )
This pie chart shows how different fossil fuels were produced in the United States during 2023.
Natural gas (dry) was the biggest part, making up 45.4% of all fossil fuel production.
Crude oil was the second largest at 31.1%.
Coal made up 13.6%.
Natural gas plant liquids were the smallest portion at 9.8%.
Together, all forms of natural gas (dry and liquids) accounted for more than half of all fossil fuel production at 55.2%.
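The combined natural gas figure follows directly from the slice percentages. A short check in base R, using only the shares quoted above (the vector name `shares` is illustrative):

```r
# Percentage shares of 2023 fossil fuel production, as quoted in the text
shares <- c(
  "Natural Gas (Dry)"         = 45.4,
  "Crude Oil"                 = 31.1,
  "Coal"                      = 13.6,
  "Natural Gas Plant Liquids" = 9.8
)

# The rounded shares need not sum to exactly 100
total <- sum(shares)

# Combined share of dry natural gas and natural gas plant liquids
gas_total <- sum(shares[grepl("Natural Gas", names(shares))])

cat(sprintf("Sum of shares: %.1f%%\n", total))
cat(sprintf("All natural gas: %.1f%%\n", gas_total))
```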
3.2.6 Trends in Types of Renewable Energy Production
Code
# Reshape the five renewable production columns into long format
renewable_energy <- energy_data_annual |>
  select(
    `Year`,
    `Hydroelectric Power Production (Quadrillion Btu)`,
    `Geothermal Energy Production (Quadrillion Btu)`,
    `Solar Energy Production (Quadrillion Btu)`,
    `Wind Energy Production (Quadrillion Btu)`,
    `Biomass Energy Production (Quadrillion Btu)`
  ) |>
  pivot_longer(
    cols = -`Year`,
    names_to = 'Energy Type',
    values_to = 'Production (Quadrillion Btu)'
  ) |>
  # Drop the unit suffix so legend labels stay short
  mutate(`Energy Type` = str_replace_all(`Energy Type`, ' \\(Quadrillion Btu\\)', '')) |>
  # Treat missing values (years before a source was reported) as zero
  mutate(`Production (Quadrillion Btu)` = ifelse(is.na(`Production (Quadrillion Btu)`), 0, `Production (Quadrillion Btu)`))

# Stacked bar chart of production by renewable source
ggplot(renewable_energy, aes(x = `Year`, y = `Production (Quadrillion Btu)`, fill = `Energy Type`)) +
  geom_bar(stat = 'identity') +
  labs(
    title = 'Renewable Energy Production',
    y = 'Production (Quadrillion Btu)',
    x = 'Year',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  scale_fill_brewer(palette = "Set1") +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  ) +
  guides(fill = guide_legend(nrow = 2))
This graph shows how renewable energy production has changed from around 1960 to 2020. Let us break down the key observations:
The total renewable energy production has significantly increased over time, reaching about 8 quadrillion BTU (British Thermal Units) by 2020. The graph shows a particularly sharp increase from 2000 onwards.
Biomass Energy (Red)
- Has consistently been the largest source.
- Shows steady growth, especially after 2000.
- Makes up roughly half of all renewable energy production.
Hydroelectric Power (Green)
- Represents the second-largest source.
- Has remained relatively stable over the decades.
- Shows some fluctuations but maintains consistent production levels.
Wind Energy (Orange)
- Virtually non-existent before 2000.
- Shows dramatic growth in the last two decades.
- Becomes a significant contributor by 2020.
Solar Energy (Purple)
- Begins to appear noticeably after 2010.
- Shows rapid recent growth.
- Still represents a smaller portion compared to other sources.
Geothermal Energy (Blue)
- Maintains a very small but steady contribution.
- Shows minimal growth over the entire period.
The chart can be divided into 3 time periods as shown below.
- 1960-1980: Relatively stable production, dominated by biomass and hydroelectric.
- 1980-2000: Modest growth with similar source distribution.
- 2000-2020: Rapid expansion with diversification into wind and solar energy.
The visualization clearly demonstrates the transition toward a more diverse renewable energy portfolio, with newer technologies like wind and solar becoming increasingly important contributors to the overall energy mix.
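The three-period split can be reproduced in code by binning years with `cut()` and summarizing each bin. The production values below are synthetic and purely illustrative; only the period boundaries come from the text.

```r
set.seed(7)

years <- 1960:2020

# Hypothetical production series (quadrillion Btu), for illustration only
production <- seq(3, 8, length.out = length(years)) + rnorm(length(years), sd = 0.2)

# Left-closed bins: [1960, 1980), [1980, 2000), [2000, 2020]
period <- cut(
  years,
  breaks = c(1960, 1980, 2000, 2020),
  labels = c("1960-1980", "1980-2000", "2000-2020"),
  right = FALSE,
  include.lowest = TRUE
)

# Number of years and mean production per period
print(table(period))
period_means <- tapply(production, period, mean)
print(round(period_means, 2))
```

With `right = FALSE`, a boundary year such as 1980 falls into the later period, which avoids counting it twice.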
Code
# Keep only the most recent year for the pie chart
renewable_energy_2023 <- renewable_energy |>
  filter(`Year` == 2023)

# Pie chart: a single stacked bar drawn in polar coordinates
ggplot(renewable_energy_2023, aes(x = '', y = `Production (Quadrillion Btu)`, fill = `Energy Type`)) +
  geom_bar(stat = 'identity', width = 1) +
  coord_polar('y', start = 0) +
  labs(
    title = '% Renewable Energy Production in 2023',
    y = '',
    x = '',
    fill = 'Energy Type',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme_minimal() +
  scale_fill_brewer(palette = "Set1") +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold', color = 'darkblue'),
    legend.title = element_blank(),
    axis.text = element_blank(),
    panel.grid = element_blank()
  ) +
  # Label each slice with its percentage of the 2023 total
  geom_text(
    aes(label = sprintf("%.1f%%", `Production (Quadrillion Btu)` / sum(`Production (Quadrillion Btu)`) * 100)),
    position = position_stack(vjust = 0.5),
    color = 'white'
  )
The pie chart shows the breakdown of renewable energy production in the United States for 2023. Biomass energy is the dominant source, accounting for 61.2% of total renewable energy production. Wind energy follows as the second-largest contributor at 17.0%, while solar energy represents 10.4% of the total. Hydroelectric power makes up 9.9%, and geothermal energy has the smallest share at 1.4%.
3.2.7 Trends in Primary Energy Production
Code
# Reshape fossil, renewable, and total production into long format
primary_energy <- energy_data_annual |>
  select(
    `Year`,
    `Total Fossil Fuels Production (Quadrillion Btu)`,
    `Total Renewable Energy Production (Quadrillion Btu)`,
    `Total Primary Energy Production (Quadrillion Btu)`
  ) |>
  pivot_longer(
    cols = -`Year`,
    names_to = 'Production Type',
    values_to = 'Production (Quadrillion Btu)'
  ) |>
  mutate(`Production Type` = str_replace_all(`Production Type`, '\\(Quadrillion Btu\\)', '')) |>
  # Order the series from largest to smallest
  mutate(`Production Type` = fct_reorder(`Production Type`, desc(`Production (Quadrillion Btu)`)))

# Area chart; note that geom_area() stacks the series by default (position = 'stack')
ggplot(primary_energy, aes(x = Year, y = `Production (Quadrillion Btu)`, fill = `Production Type`)) +
  geom_area(color = 'black', linewidth = 0.5) +
  labs(
    title = 'Renewable vs Fossil vs Total Energy Production',
    x = 'Year',
    y = 'Production (Quadrillion Btu)',
    caption = 'Data Source: U.S. Energy Information Administration'
  ) +
  theme(
    plot.title = element_text(face = 'bold', hjust = 0.5, color = 'darkblue'),
    legend.position = 'bottom',
    legend.box = 'horizontal',
    legend.title = element_blank()
  )
Production Growth
- In 1950, total energy production started at around 30 Quadrillion BTU.
- By 2023, it reached nearly 102 Quadrillion BTU.
- Growth was relatively steady until 2000, after which it accelerated noticeably.
Fossil Fuels vs Renewables
- Fossil fuels (green) make up the majority of energy production throughout the entire period.
- Renewable energy (blue) represents a much smaller portion, visible as a thin blue strip at the bottom.
- The gap between renewables and fossil fuels is pronounced, which indicates that substantial investment is required to transform the energy sector and move toward environmentally friendly green energy.
The visualization effectively shows how our energy production system remains heavily dependent on fossil fuels, despite the growing contribution from renewable sources.
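The degree of dependence can be put into a single number. The sketch below combines two approximate 2023 figures quoted in this report (about 102 quadrillion Btu of total production and nearly 85 of fossil fuel production); both are rounded readings, so the share is an estimate.

```r
# Approximate 2023 figures quoted in the text (quadrillion Btu)
total_production  <- 102
fossil_production <- 85

# Fossil share of total primary energy production
fossil_share <- fossil_production / total_production

# Remainder, which covers nuclear and renewable production combined
non_fossil_share <- 1 - fossil_share

cat(sprintf("Fossil share: %.1f%%\n", fossil_share * 100))
cat(sprintf("Non-fossil share: %.1f%%\n", non_fossil_share * 100))
```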
3.3 Sectorwise Energy Consumption Analysis
3.3.1 Energy Overview by Residential Sector
Code
# Monthly consumption and electrical system losses as xts time series
rs_energy_consumed <- xts(
  x = energy_data_monthly$`Total Energy Consumed by the Residential Sector (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)
rs_energy_loss <- xts(
  x = energy_data_monthly$`Residential Sector Electrical System Energy Losses (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)

# Interactive stacked chart with a range selector
dygraph(cbind(rs_energy_consumed, rs_energy_loss), main = 'Energy Consumed vs Energy Loss (Residential Sector)') |>
  dySeries('rs_energy_consumed', label = 'Energy Consumed') |>
  dySeries('rs_energy_loss', label = 'Energy Loss') |>
  dyRangeSelector() |>
  dyOptions(stackedGraph = TRUE, drawPoints = TRUE, pointSize = 2) |>
  dyAxis("x", label = "Timeline") |>
  dyAxis("y", label = "Energy (Trillion Btu)")
3.3.2 Energy Overview by Commercial Sector
Code
# Monthly consumption and electrical system losses as xts time series
rs_energy_consumed <- xts(
  x = energy_data_monthly$`Total Energy Consumed by the Commercial Sector (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)
rs_energy_loss <- xts(
  x = energy_data_monthly$`Commercial Sector Electrical System Energy Losses (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)

# Interactive stacked chart with a range selector
dygraph(cbind(rs_energy_consumed, rs_energy_loss), main = 'Energy Consumed vs Energy Loss (Commercial Sector)') |>
  dySeries('rs_energy_consumed', label = 'Energy Consumed') |>
  dySeries('rs_energy_loss', label = 'Energy Loss') |>
  dyRangeSelector() |>
  dyOptions(stackedGraph = TRUE, drawPoints = TRUE, pointSize = 2) |>
  dyAxis("x", label = "Timeline") |>
  dyAxis("y", label = "Energy (Trillion Btu)")
3.3.3 Energy Overview by Industrial Sector
Code
# Monthly consumption and electrical system losses as xts time series
rs_energy_consumed <- xts(
  x = energy_data_monthly$`Total Energy Consumed by the Industrial Sector (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)
rs_energy_loss <- xts(
  x = energy_data_monthly$`Industrial Sector Electrical System Energy Losses (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)

# Interactive stacked chart with a range selector
dygraph(cbind(rs_energy_consumed, rs_energy_loss), main = 'Energy Consumed vs Energy Loss (Industrial Sector)') |>
  dySeries('rs_energy_consumed', label = 'Energy Consumed') |>
  dySeries('rs_energy_loss', label = 'Energy Loss') |>
  dyRangeSelector() |>
  dyOptions(stackedGraph = TRUE, drawPoints = TRUE, pointSize = 2) |>
  dyAxis("x", label = "Timeline") |>
  dyAxis("y", label = "Energy (Trillion Btu)")
3.3.4 Energy Overview by Transportation Sector
Code
# Monthly consumption and electrical system losses as xts time series
rs_energy_consumed <- xts(
  x = energy_data_monthly$`Total Energy Consumed by the Transportation Sector (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)
rs_energy_loss <- xts(
  x = energy_data_monthly$`Electrical System Energy Losses Proportioned to the Transportation Sector (Trillion Btu)`,
  order.by = energy_data_monthly$Month
)

# Interactive stacked chart with a range selector
dygraph(cbind(rs_energy_consumed, rs_energy_loss), main = 'Energy Consumed vs Energy Loss (Transportation Sector)') |>
  dySeries('rs_energy_consumed', label = 'Energy Consumed') |>
  dySeries('rs_energy_loss', label = 'Energy Loss') |>
  dyRangeSelector() |>
  dyOptions(stackedGraph = TRUE, drawPoints = TRUE, pointSize = 2) |>
  dyAxis("x", label = "Timeline") |>
  dyAxis("y", label = "Energy (Trillion Btu)")